Senior Cloud Site Reliability Engineer
Req ID: 26-321
Come join our passionate team! Barracuda is a leading cybersecurity company providing complete protection against complex threats. Our platform protects email, data, applications, and networks with innovative solutions and a managed XDR service to strengthen cyber resilience. Hundreds of thousands of IT professionals and managed service providers worldwide trust us to protect and support them with solutions that are easy to buy, deploy, and use.
We are committed to a candidate selection process and work environment that is inclusive and barrier free. To ensure candidates are assessed in a fair and equitable manner, accommodations will be provided to prospective employees in accordance with the Accessibility for Ontarians with Disabilities Act (AODA) and the Ontario Human Rights Code.
Envision yourself at Barracuda
We seek a passionate and experienced Senior Cloud Site Reliability Engineer (SRE) for the Email Protection business unit with great technical acumen and a strong background in operations, automation, implementation, and development. You will be responsible for ensuring the availability and seamless scaling of high-volume, critical SaaS applications. The application portfolio spans a broad spectrum of Email Protection products.
What you'll be working on:
- Application Infrastructure Support: Work with internal customers to understand application design and cloud infrastructure requirements, focusing on scalability and reliability
- Infrastructure Automation: Implement templates, tools, and scripts for infrastructure deployment to support development teams
- Platform Support: Help develop and maintain self-service platforms for the Product Engineering team
- Service Level Management: Implement and monitor SLIs, SLOs, and SLAs across services
- Incident Management: Participate in incident response processes and contribute to post-incident reviews
- Disaster Recovery: Help maintain disaster recovery and business continuity plans
- Technical Implementation: Implement non-functional requirements including security, performance, and monitoring
- Solution Implementation: Assist with architecture implementation, solution design, and code reviews
- Technology Stack Implementation: Implement solutions using AWS, Kubernetes, GitHub Actions, Jenkins, Terraform, and other current technologies
- Deployment Automation: Support initiatives to convert manual deployments to automated processes
- Observability Systems: Maintain and enhance monitoring and reliability systems
- On-Call Duties: Participate in on-call rotation to ensure 24/7 system reliability
What you bring to the role:
- Technical Expertise: 5+ years of hands-on infrastructure experience, including 3+ years in cloud development and SRE/DevOps roles
- Cloud Infrastructure: Strong knowledge of AWS cloud infrastructure, security, and operations in production environments
- Infrastructure as Code: Experience with Terraform, CloudFormation, or Pulumi for cloud infrastructure automation
- CI/CD & Automation: Experience with GitHub, GitHub Actions, Jenkins, and configuration management tools
- Deployment Patterns: Knowledge of blue/green, canary, and rolling deployment strategies
- Container Orchestration: Experience with Docker, Kubernetes, and EKS in AWS environments
- Programming: Solid coding abilities in Python, Go, or similar languages
- Operating Systems: Strong Linux knowledge including system administration
- Observability: Experience with monitoring tools like New Relic, CloudWatch, Prometheus, and Grafana
- Problem Solving: Good debugging and troubleshooting capabilities
- Certifications: AWS certifications (Solutions Architect, SysOps) or Kubernetes certifications (CKA, CKAD) a plus
What you’ll get from us:
A team where you can voice your opinion and make an impact, and where you and your experience are valued. Internal mobility – there are opportunities for cross-training and the ability to attain your next career step within Barracuda. In addition, you will receive equity, in the form of non-qualifying options.
#LI-remote